Used for prediction and inference when the response is quantitative.
Applied when the relationship between the response and predictor(s) is assumed to be close to linear.
A linear model with one predictor is referred to as a Simple Linear Model, and one with more than one predictor is called a Multiple Linear Model.
Simple Linear Model
A perfect linear relationship is unrealistic for any natural process.
So, it is not possible to predict the exact value of \(y\) just by knowing \(x\).
Example: family income (\(x\)) and the financial support (\(y\)) a college offers a student.
This doesn’t mean one cannot make a reasonably good estimate of \(y\) using \(x\).
Simple Linear Model
The relationship between \(x\) and \(y\) can be modeled as a straight line with some error:
\(y = b_0 + b_1x + \epsilon\)
\(b_0\) is the intercept and \(b_1\) is the slope of the line; the error is represented by \(\epsilon\).
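To make the role of \(\epsilon\) concrete, here is a minimal R sketch that simulates data from such a model; the intercept, slope, and noise level are made-up values chosen purely for illustration.

```r
# Simulate y = b0 + b1*x + epsilon with hypothetical values for b0, b1,
# and the noise standard deviation.
set.seed(42)
b0 <- 10                                   # hypothetical intercept
b1 <- 0.5                                  # hypothetical slope
x  <- runif(100, min = 0, max = 50)
y  <- b0 + b1 * x + rnorm(100, mean = 0, sd = 2)   # epsilon ~ N(0, 2^2)

plot(x, y, pch = 19, col = "steelblue")
abline(a = b0, b = b1, lty = 2)            # the "true" line; points scatter around it
```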
Possums
Brushtail possum measurements (head_l and skull_w in mm; total_l and tail_l in cm):

| site | pop | sex | age | head_l | skull_w | total_l | tail_l |
|------|-----|-----|-----|--------|---------|---------|--------|
| 1 | Vic | m | 8 | 94.1 | 60.4 | 89.0 | 36.0 |
| 1 | Vic | f | 6 | 92.5 | 57.6 | 91.5 | 36.5 |
| 1 | Vic | f | 6 | 94.0 | 60.0 | 95.5 | 39.0 |
| 1 | Vic | f | 6 | 93.2 | 57.1 | 92.0 | 38.0 |
| 1 | Vic | f | 2 | 91.5 | 56.3 | 85.5 | 36.0 |
| 1 | Vic | f | 1 | 93.1 | 54.8 | 90.5 | 35.5 |
| 1 | Vic | m | 2 | 95.3 | 58.2 | 89.5 | 36.0 |
| 1 | Vic | f | 6 | 94.8 | 57.6 | 91.0 | 37.0 |
| 1 | Vic | f | 9 | 93.4 | 56.3 | 91.5 | 37.0 |
| 1 | Vic | f | 6 | 91.8 | 58.0 | 89.5 | 37.5 |
| 1 | Vic | f | 9 | 93.3 | 57.2 | 89.5 | 39.0 |
| 1 | Vic | f | 5 | 94.9 | 55.6 | 92.0 | 35.5 |
| 1 | Vic | m | 5 | 95.1 | 59.9 | 89.5 | 36.0 |
| 1 | Vic | m | 3 | 95.4 | 57.6 | 91.5 | 36.0 |
| 1 | Vic | m | 5 | 92.9 | 57.6 | 85.5 | 34.0 |
| 1 | Vic | m | 4 | 91.6 | 56.0 | 86.0 | 34.5 |
| 1 | Vic | f | 1 | 94.7 | 67.7 | 89.5 | 36.5 |
| 1 | Vic | m | 2 | 93.5 | 55.7 | 90.0 | 36.0 |
| 1 | Vic | f | 5 | 94.4 | 55.4 | 90.5 | 35.0 |
| 1 | Vic | f | 4 | 94.8 | 56.3 | 89.0 | 38.0 |
| 1 | Vic | f | 3 | 95.9 | 58.1 | 96.5 | 39.5 |
| 1 | Vic | m | 3 | 96.3 | 58.5 | 91.0 | 39.5 |
| 1 | Vic | f | 4 | 92.5 | 56.1 | 89.0 | 36.0 |
| 1 | Vic | m | 2 | 94.4 | 54.9 | 84.0 | 34.0 |
| 1 | Vic | m | 3 | 95.8 | 58.5 | 91.5 | 35.5 |
| 1 | Vic | m | 7 | 96.0 | 59.0 | 90.0 | 36.0 |
| 1 | Vic | f | 2 | 90.5 | 54.5 | 85.0 | 35.0 |
| 1 | Vic | m | 4 | 93.8 | 56.8 | 87.0 | 34.5 |
| 1 | Vic | f | 3 | 92.8 | 56.0 | 88.0 | 35.0 |
| 1 | Vic | f | 2 | 92.1 | 54.4 | 84.0 | 33.5 |
| 1 | Vic | m | 3 | 92.8 | 54.1 | 93.0 | 37.0 |
| 1 | Vic | f | 4 | 94.3 | 56.7 | 94.0 | 39.0 |
| 1 | Vic | m | 3 | 91.4 | 54.6 | 89.0 | 37.0 |
| 2 | Vic | m | 2 | 90.6 | 55.7 | 85.5 | 36.5 |
| 2 | Vic | m | 4 | 94.4 | 57.9 | 85.0 | 35.5 |
| 2 | Vic | m | 7 | 93.3 | 59.3 | 88.0 | 35.0 |
| 2 | Vic | f | 2 | 89.3 | 54.8 | 82.5 | 35.0 |
| 2 | Vic | m | 7 | 92.4 | 56.0 | 80.5 | 35.5 |
| 2 | Vic | f | 1 | 84.7 | 51.5 | 75.0 | 34.0 |
| 2 | Vic | f | 3 | 91.0 | 55.0 | 84.5 | 36.0 |
| 2 | Vic | f | 5 | 88.4 | 57.0 | 83.0 | 36.5 |
| 2 | Vic | m | 3 | 85.3 | 54.1 | 77.0 | 32.0 |
| 2 | Vic | f | 2 | 90.0 | 55.5 | 81.0 | 32.0 |
| 2 | Vic | m | NA | 85.1 | 51.5 | 76.0 | 35.5 |
| 2 | Vic | m | 3 | 90.7 | 55.9 | 81.0 | 34.0 |
| 2 | Vic | m | NA | 91.4 | 54.4 | 84.0 | 35.0 |
| 3 | other | m | 2 | 90.1 | 54.8 | 89.0 | 37.5 |
| 3 | other | m | 5 | 98.6 | 63.2 | 85.0 | 34.0 |
| 3 | other | m | 4 | 95.4 | 59.2 | 85.0 | 37.0 |
| 3 | other | f | 5 | 91.6 | 56.4 | 88.0 | 38.0 |
| 3 | other | f | 5 | 95.6 | 59.6 | 85.0 | 36.0 |
| 3 | other | m | 6 | 97.6 | 61.0 | 93.5 | 40.0 |
| 3 | other | f | 3 | 93.1 | 58.1 | 91.0 | 38.0 |
| 4 | other | m | 7 | 96.9 | 63.0 | 91.5 | 43.0 |
| 4 | other | m | 2 | 103.1 | 63.2 | 92.5 | 38.0 |
| 4 | other | m | 3 | 99.9 | 61.5 | 93.7 | 38.0 |
| 4 | other | f | 4 | 95.1 | 59.4 | 93.0 | 41.0 |
| 4 | other | m | 3 | 94.5 | 64.2 | 91.0 | 39.0 |
| 4 | other | m | 2 | 102.5 | 62.8 | 96.0 | 40.0 |
| 4 | other | f | 2 | 91.3 | 57.7 | 88.0 | 39.0 |
| 5 | other | m | 7 | 95.7 | 59.0 | 86.0 | 38.0 |
| 5 | other | f | 3 | 91.3 | 58.0 | 90.5 | 39.0 |
| 5 | other | f | 6 | 92.0 | 56.4 | 88.5 | 38.0 |
| 5 | other | f | 3 | 96.9 | 56.5 | 89.5 | 38.5 |
| 5 | other | f | 5 | 93.5 | 57.4 | 88.5 | 38.0 |
| 5 | other | f | 3 | 90.4 | 55.8 | 86.0 | 36.5 |
| 5 | other | m | 4 | 93.3 | 57.6 | 85.0 | 36.5 |
| 5 | other | m | 5 | 94.1 | 56.0 | 88.5 | 38.0 |
| 5 | other | m | 5 | 98.0 | 55.6 | 88.0 | 37.5 |
| 5 | other | f | 7 | 91.9 | 56.4 | 87.0 | 38.0 |
| 5 | other | m | 6 | 92.8 | 57.6 | 90.0 | 40.0 |
| 5 | other | m | 1 | 85.9 | 52.4 | 80.5 | 35.0 |
| 5 | other | m | 1 | 82.5 | 52.3 | 82.0 | 36.5 |
| 6 | other | f | 4 | 88.7 | 52.0 | 83.0 | 38.0 |
| 6 | other | m | 6 | 93.8 | 58.1 | 89.0 | 38.0 |
| 6 | other | m | 5 | 92.4 | 56.8 | 89.0 | 41.0 |
| 6 | other | m | 6 | 93.6 | 56.2 | 84.0 | 36.0 |
| 6 | other | m | 1 | 86.5 | 51.0 | 81.0 | 36.5 |
| 6 | other | m | 1 | 85.8 | 50.0 | 81.0 | 36.5 |
| 6 | other | m | 1 | 86.7 | 52.6 | 84.0 | 38.0 |
| 6 | other | m | 3 | 90.6 | 56.0 | 85.5 | 38.0 |
| 6 | other | f | 4 | 86.0 | 54.0 | 82.0 | 36.5 |
| 6 | other | f | 3 | 90.0 | 53.8 | 81.5 | 36.0 |
| 6 | other | m | 3 | 88.4 | 54.6 | 80.5 | 36.0 |
| 6 | other | m | 3 | 89.5 | 56.2 | 92.0 | 40.5 |
| 6 | other | f | 3 | 88.2 | 53.2 | 86.5 | 38.5 |
| 7 | other | m | 2 | 98.5 | 60.7 | 93.0 | 41.5 |
| 7 | other | f | 2 | 89.6 | 58.0 | 87.5 | 38.0 |
| 7 | other | m | 6 | 97.7 | 58.4 | 84.5 | 35.0 |
| 7 | other | m | 3 | 92.6 | 54.6 | 85.0 | 38.5 |
| 7 | other | m | 3 | 97.8 | 59.6 | 89.0 | 38.0 |
| 7 | other | m | 2 | 90.7 | 56.3 | 85.0 | 37.0 |
| 7 | other | m | 3 | 89.2 | 54.0 | 82.0 | 38.0 |
| 7 | other | m | 7 | 91.8 | 57.6 | 84.0 | 35.5 |
| 7 | other | m | 4 | 91.6 | 56.6 | 88.5 | 37.5 |
| 7 | other | m | 4 | 94.8 | 55.7 | 83.0 | 38.0 |
| 7 | other | m | 3 | 91.0 | 53.1 | 86.0 | 38.0 |
| 7 | other | m | 5 | 93.2 | 68.6 | 84.0 | 35.0 |
| 7 | other | f | 3 | 93.3 | 56.2 | 86.5 | 38.5 |
| 7 | other | m | 1 | 89.5 | 56.0 | 81.5 | 36.5 |
| 7 | other | m | 1 | 88.6 | 54.7 | 82.5 | 39.0 |
| 7 | other | f | 6 | 92.4 | 55.0 | 89.0 | 38.0 |
| 7 | other | m | 4 | 91.5 | 55.2 | 82.5 | 36.5 |
| 7 | other | f | 3 | 93.6 | 59.9 | 89.0 | 40.0 |
Possums
The equation of the line we have is:
\(\hat{y} = 42.7 + 0.573x\)
For possums with a total length of 85 cm, we estimate the average head length (in mm) to be:
\(\hat{y} = 42.7 + 0.573 \times 85 = 91.405\)
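As a sketch of how this line and the prediction above could be reproduced in R, assuming the possum data frame used on these slides (for example, the one shipped with the openintro package) is available:

```r
# Fit head length on total length and reproduce the prediction at 85 cm.
# Assumes the possum data frame (e.g. from the openintro package) is loaded.
library(openintro)

fit <- lm(head_l ~ total_l, data = possum)
coef(fit)                                        # roughly 42.7 and 0.573, as quoted above

predict(fit, newdata = data.frame(total_l = 85)) # roughly 91.4
```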
Residuals
Observed head lengths (head_l), fitted values, and residuals from the model \(\hat{y} = 42.7 + 0.573x\):

| head_l | .fitted | .resid |
|--------|---------|--------|
| 94.1 | 93.69801 | 0.4019925 |
| 92.5 | 95.13026 | -2.6302607 |
| 94.0 | 97.42187 | -3.4218658 |
| 93.2 | 95.41671 | -2.2167113 |
| 91.5 | 91.69285 | -0.1928530 |
| 93.1 | 94.55736 | -1.4573594 |
| 95.3 | 93.98446 | 1.3155419 |
| 94.8 | 94.84381 | -0.0438100 |
| 93.4 | 95.13026 | -1.7302607 |
| 91.8 | 93.98446 | -2.1844581 |
| 93.3 | 93.98446 | -0.6844581 |
| 94.9 | 95.41671 | -0.5167113 |
| 95.1 | 93.98446 | 1.1155419 |
| 95.4 | 95.13026 | 0.2697393 |
| 92.9 | 91.69285 | 1.2071470 |
| 91.6 | 91.97930 | -0.3793036 |
| 94.7 | 93.98446 | 0.7155419 |
| 93.5 | 94.27091 | -0.7709087 |
| 94.4 | 94.55736 | -0.1573594 |
| 94.8 | 93.69801 | 1.1019925 |
| 95.9 | 97.99477 | -2.0947671 |
| 96.3 | 94.84381 | 1.4561900 |
| 92.5 | 93.69801 | -1.1980075 |
| 94.4 | 90.83350 | 3.5664990 |
| 95.8 | 95.13026 | 0.6697393 |
| 96.0 | 94.27091 | 1.7290913 |
| 90.5 | 91.40640 | -0.9064023 |
| 93.8 | 92.55220 | 1.2477951 |
| 92.8 | 93.12511 | -0.3251062 |
| 92.1 | 90.83350 | 1.2664990 |
| 92.8 | 95.98961 | -3.1896126 |
| 94.3 | 96.56251 | -2.2625139 |
| 91.4 | 93.69801 | -2.2980075 |
| 90.6 | 91.69285 | -1.0928530 |
| 94.4 | 91.40640 | 2.9935977 |
| 93.3 | 93.12511 | 0.1748938 |
| 89.3 | 89.97415 | -0.6741491 |
| 92.4 | 88.82835 | 3.5716535 |
| 84.7 | 85.67739 | -0.9773895 |
| 91.0 | 91.11995 | -0.1199517 |
| 88.4 | 90.26060 | -1.8605997 |
| 85.3 | 86.82319 | -1.5231920 |
| 90.0 | 89.11480 | 0.8852028 |
| 85.1 | 86.25029 | -1.1502908 |
| 90.7 | 89.11480 | 1.5852028 |
| 91.4 | 90.83350 | 0.5664990 |
| 90.1 | 93.69801 | -3.5980075 |
| 98.6 | 91.40640 | 7.1935977 |
| 95.4 | 91.40640 | 3.9935977 |
| 91.6 | 93.12511 | -1.5251062 |
| 95.6 | 91.40640 | 4.1935977 |
| 97.6 | 96.27606 | 1.3239368 |
| 93.1 | 94.84381 | -1.7438100 |
| 96.9 | 95.13026 | 1.7697393 |
| 103.1 | 95.70316 | 7.3968380 |
| 99.9 | 96.39064 | 3.5093565 |
| 95.1 | 95.98961 | -0.8896126 |
| 94.5 | 94.84381 | -0.3438100 |
| 102.5 | 97.70832 | 4.7916836 |
| 91.3 | 93.12511 | -1.8251062 |
| 95.7 | 91.97930 | 3.7206964 |
| 91.3 | 94.55736 | -3.2573594 |
| 92.0 | 93.41156 | -1.4115568 |
| 96.9 | 93.98446 | 2.9155419 |
| 93.5 | 93.41156 | 0.0884432 |
| 90.4 | 91.97930 | -1.5793036 |
| 93.3 | 91.40640 | 1.8935977 |
| 94.1 | 93.41156 | 0.6884432 |
| 98.0 | 93.12511 | 4.8748938 |
| 91.9 | 92.55220 | -0.6522049 |
| 92.8 | 94.27091 | -1.4709087 |
| 85.9 | 88.82835 | -2.9283465 |
| 82.5 | 89.68770 | -7.1876985 |
| 88.7 | 90.26060 | -1.5605997 |
| 93.8 | 93.69801 | 0.1019925 |
| 92.4 | 93.69801 | -1.2980075 |
| 93.6 | 90.83350 | 2.7664990 |
| 86.5 | 89.11480 | -2.6147972 |
| 85.8 | 89.11480 | -3.3147972 |
| 86.7 | 90.83350 | -4.1335010 |
| 90.6 | 91.69285 | -1.0928530 |
| 86.0 | 89.68770 | -3.6876985 |
| 90.0 | 89.40125 | 0.5987522 |
| 88.4 | 88.82835 | -0.4283465 |
| 89.5 | 95.41671 | -5.9167113 |
| 88.2 | 92.26575 | -4.0657542 |
| 98.5 | 95.98961 | 2.5103874 |
| 89.6 | 92.83866 | -3.2386555 |
| 97.7 | 91.11995 | 6.5800483 |
| 92.6 | 91.40640 | 1.1935977 |
| 97.8 | 93.69801 | 4.1019925 |
| 90.7 | 91.40640 | -0.7064023 |
| 89.2 | 89.68770 | -0.4876985 |
| 91.8 | 90.83350 | 0.9664990 |
| 91.6 | 93.41156 | -1.8115568 |
| 94.8 | 90.26060 | 4.5394003 |
| 91.0 | 91.97930 | -0.9793036 |
| 93.2 | 90.83350 | 2.3664990 |
| 93.3 | 92.26575 | 1.0342458 |
| 89.5 | 89.40125 | 0.0987522 |
| 88.6 | 89.97415 | -1.3741491 |
| 92.4 | 93.69801 | -1.2980075 |
| 91.5 | 89.97415 | 1.5258509 |
| 93.6 | 93.69801 | -0.0980075 |
Residual
Formally, we define the residual for the \(i^{th}\) observation as:
\(e_i = y_i - \hat{y}_i\)
DIY-1 {2 mins}
Use the linear model \(\hat{y} = 41 + 0.59x\) to compute the residual for the observation (76.0, 85.1).
DIY-2
If a model underestimates an observation, will the residual be positive or negative?
Residuals
Least Squares
Provides an objective measure for finding the best line
The best line is the one with the smallest residuals, i.e. the one that minimizes the sum of squared residuals:
\(e_1^2 + e_2^2 + \dots + e_n^2\)
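As a rough illustration (assuming the possum data frame used earlier), the sketch below compares the sum of squared residuals of the least-squares line with that of a slightly perturbed line; any other line gives a larger sum.

```r
# Least squares minimizes e_1^2 + ... + e_n^2.
# Assumes the possum data frame used earlier is available.
fit <- lm(head_l ~ total_l, data = possum)

ssr <- function(b0, b1, data) {
  sum((data$head_l - (b0 + b1 * data$total_l))^2)
}

ssr(coef(fit)[1], coef(fit)[2], possum)         # SSR of the least-squares line
ssr(coef(fit)[1], coef(fit)[2] + 0.05, possum)  # a perturbed slope gives a larger SSR
```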
Interpretation
For a model:
\(\hat{y} = \beta_0 + \beta_1 x\)
The slope (\(\beta_1\)) describes the estimated difference in the predicted average outcome of \(y\) if the predictor variable \(x\) happened to be one unit larger.
The intercept (\(\beta_0\)) describes the average outcome of \(y\) if \(x = 0\).
Extrapolation
Elmhurst College financial aid data (family_income, gift_aid, and price_paid in 1000 USD):

| family_income | gift_aid | price_paid |
|---------------|----------|------------|
| 92.922 | 21.720 | 14.280 |
| 0.250 | 27.470 | 8.530 |
| 53.092 | 27.750 | 14.250 |
| 50.200 | 27.220 | 8.780 |
| 137.613 | 18.000 | 24.000 |
| 47.957 | 18.520 | 23.480 |
| 113.534 | 13.000 | 23.000 |
| 168.579 | 13.000 | 29.000 |
| 208.115 | 14.000 | 28.000 |
| 12.523 | 25.470 | 16.530 |
| 119.822 | 21.000 | 15.000 |
| 50.563 | 17.476 | 18.524 |
| 16.120 | 22.470 | 13.530 |
| 206.932 | 11.000 | 25.000 |
| 68.678 | 25.720 | 16.280 |
| 73.598 | 32.720 | 9.280 |
| 218.120 | 23.000 | 19.000 |
| 89.983 | 16.000 | 20.000 |
| 271.974 | 20.000 | 22.000 |
| 118.165 | 24.000 | 18.000 |
| 108.395 | 15.500 | 26.500 |
| 235.522 | 7.000 | 35.000 |
| 78.926 | 20.000 | 16.000 |
| 76.854 | 23.520 | 18.480 |
| 98.496 | 14.000 | 22.000 |
| 134.586 | 10.000 | 32.000 |
| 75.157 | 21.120 | 20.880 |
| 135.857 | 21.000 | 21.000 |
| 79.448 | 27.500 | 14.500 |
| 80.858 | 20.550 | 15.450 |
| 86.140 | 14.300 | 27.700 |
| 40.490 | 18.320 | 23.680 |
| 143.337 | 18.000 | 24.000 |
| 97.664 | 10.000 | 26.000 |
| 74.713 | 21.000 | 15.000 |
| 178.795 | 13.600 | 22.400 |
| 71.550 | 20.470 | 15.530 |
| 92.605 | 21.000 | 15.000 |
| 62.546 | 21.600 | 14.400 |
| 0.000 | 27.470 | 14.530 |
| 159.981 | 25.814 | 10.186 |
| 40.397 | 25.970 | 10.030 |
| 85.203 | 25.558 | 16.442 |
| 27.164 | 20.470 | 21.530 |
| 146.397 | 17.000 | 25.000 |
| 14.089 | 20.420 | 21.580 |
| 217.443 | 20.000 | 22.000 |
| 140.093 | 15.000 | 21.000 |
| 104.147 | 17.560 | 24.440 |
| 83.333 | 23.500 | 18.500 |
Extrapolation
```r
lm(gift_aid ~ family_income, data = elmhurst) |>            # <1>
  broom::augment() |>                                        # <2>
  ggplot(aes(x = family_income, y = gift_aid)) +
  geom_point(colour = "steelblue") +
  geom_smooth(method = "lm", se = FALSE, linetype = 2) +
  geom_segment(aes(xend = family_income, yend = .fitted), colour = "red", alpha = 0.5) +
  labs(
    x = "Family Income in 1000 USD",
    y = "Gift Aid in 1000 USD"
  ) +
  theme_minimal()
```

1. Regression to estimate gift aid received using family income.
2. Use the model to generate a data frame of predicted and residual values along with other estimates.
Analyse with and without the outliers. Are the results different? Think about why (see the sketch below).
Present the differences for discussion.
Do not remove these points without good reason.
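A minimal sketch of that comparison, assuming the elmhurst data frame used above; which points count as outliers is a judgement call, so the cut-off below is arbitrary and only for illustration.

```r
# Refit the gift-aid model with and without the largest family incomes.
library(dplyr)

fit_all  <- lm(gift_aid ~ family_income, data = elmhurst)
fit_trim <- lm(gift_aid ~ family_income,
               data = filter(elmhurst, family_income < 200))  # arbitrary cut-off

coef(fit_all)
coef(fit_trim)   # compare slopes and intercepts before deciding to drop anything
```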
High Leverage Points
“Points that fall horizontally away from the center of the cloud tend to pull harder on the line, so we call them points with high leverage or leverage points.”
If such points do affect the slope of the line, we call them influential points.
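Base R offers standard diagnostics for leverage and influence; a hedged sketch using the gift-aid model as an example:

```r
# Leverage and influence diagnostics (sketch, assuming the elmhurst data frame).
fit <- lm(gift_aid ~ family_income, data = elmhurst)

hatvalues(fit)        # leverage: large values lie far from the centre of the x values
cooks.distance(fit)   # influence: large values noticeably move the fitted line

# Flag potentially influential observations with a common rule-of-thumb cut-off.
which(cooks.distance(fit) > 4 / nrow(elmhurst))
```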
How are these model fits different?
(Figure: ex1-ims)
How would the model fit look for these residual plots?
(Figure: ex2-ims)
Multiple Linear Model
When many variables are associated with the response at once
Why do we need a Multiple Linear Model?
Why can’t I run several simple linear models?
How would you make a single prediction from many simple linear models?
Each simple linear model will ignore the other factors that are associated with the response, as the sketch below illustrates.
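A brief sketch of that last point, assuming the possum data frame: the coefficient on total_l from a simple model can differ from its coefficient in the multiple model fitted on the next slide, because the simple model ignores sex and age.

```r
# Compare the total_l coefficient with and without the other predictors.
simple   <- lm(head_l ~ total_l, data = possum)
multiple <- lm(head_l ~ total_l + sex + age, data = possum)

coef(simple)["total_l"]
coef(multiple)["total_l"]   # adjusted for sex and age
```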
Multiple Linear Model
```r
lm(head_l ~ total_l + sex + age, data = possum) |>
  broom::tidy() |>
  kableExtra::kable()
```
“The adjusted R-squared adjusts for the number of terms in the model. Importantly, its value increases only when the new term improves the model fit more than expected by chance alone. The adjusted R-squared value actually decreases when the term doesn’t improve the model fit by a sufficient amount.”
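One way to see this behaviour is to compare \(R^2\) and adjusted \(R^2\) for nested models via broom::glance(); a sketch assuming the possum data frame (rows with missing age are dropped so both models use the same data):

```r
library(broom)

dat <- na.omit(possum)   # keep the same rows for both models

m1 <- lm(head_l ~ total_l, data = dat)
m2 <- lm(head_l ~ total_l + sex + age, data = dat)

glance(m1)[, c("r.squared", "adj.r.squared")]
glance(m2)[, c("r.squared", "adj.r.squared")]
# R-squared never decreases as terms are added; adjusted R-squared rises only
# when the new terms improve the fit by more than chance alone.
```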
Prediction
The coefficient estimates \(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_p\) are estimates of the population parameters \(\beta_0, \beta_1, \dots, \beta_p\). This uncertainty is related to the reducible error we talked about in the bias-variance tradeoff. To address it we use confidence intervals, which describe how close \(\hat{Y}\) is to \(f(X)\).
Even if we had perfect estimates of the parameters, we would still have to deal with the irreducible error (\(\epsilon\)) that is hidden in every realization of \(Y\). To indicate this we use prediction intervals, which describe how much \(Y\) varies from \(\hat{Y}\).
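In R, both kinds of interval are available from predict(); a sketch using the simple possum model as an example:

```r
# Confidence interval: uncertainty about the average head length at total_l = 85.
# Prediction interval: uncertainty about a single new possum's head length.
fit <- lm(head_l ~ total_l, data = possum)
new <- data.frame(total_l = 85)

predict(fit, newdata = new, interval = "confidence")
predict(fit, newdata = new, interval = "prediction")   # always wider
```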